IFN/ENIT - Database of Handwritten Arabic Words
Abstract
This paper presents a new database of handwritten Arabic town/village names. For each name, the ground truth information is coded, e.g. the sequence of character shapes, some style information, and the baseline. 411 writers filled in forms with about 26400 names containing more than 210000 characters. The database is described in detail. It is designed for training and testing recognition systems for handwritten Arabic words. The IFN/ENIT-database is available for research purposes.
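To make the ground truth description above more concrete, here is a minimal Python sketch of how one annotated word could be represented in memory. It is not the database's actual file format: the class name `WordSample`, all field names, the baseline encoding as two pixel end points, and the shape labels are assumptions made for illustration. The final lines only derive rough averages from the rounded totals quoted in the abstract.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class WordSample:
    """One handwritten town/village name with its ground truth (hypothetical layout)."""
    writer_id: str                        # hypothetical identifier of the writer
    name: str                             # the town/village name that was written
    char_shapes: List[str]                # sequence of character shape labels (made-up labels)
    baseline: Tuple[int, int, int, int]   # assumed encoding: (x1, y1, x2, y2) in pixels
    style_notes: str = ""                 # free-form style information


def mean_chars_per_name(samples: List[WordSample]) -> float:
    """Average number of character shapes per name; 0.0 for an empty list."""
    if not samples:
        return 0.0
    return sum(len(s.char_shapes) for s in samples) / len(samples)


# Rough averages implied by the rounded totals in the abstract
# ("about 26400 names", "more than 210000 characters", 411 writers).
NAMES, WRITERS, CHARS = 26400, 411, 210000
names_per_writer = NAMES / WRITERS   # roughly 64 names per writer
chars_per_name = CHARS / NAMES       # roughly 8 characters per name
```

Because the abstract gives only approximate totals, the two averages should be read as order-of-magnitude figures, not as exact statistics of the database.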
Similar resources
HMM Based Approach for Handwritten Arabic Word Recognition Using the IFN/ENIT-Database
An offline recognition system for Arabic handwritten words is presented. The recognition system is based on a semi-continuous 1-dimensional HMM. From each binary word image, normalization parameters are estimated. First, height, length, and baseline skew are normalized; then features are collected using a sliding window approach. This paper presents these methods in more detail. Some parameters ...
Component-based Segmentation of Words from Handwritten Arabic Text
Efficient preprocessing is essential for automatic recognition of handwritten documents. In this paper, techniques for segmenting words in handwritten Arabic text are presented. First, connected components (CCs) are extracted, and the distances among different components are analyzed. The statistical distribution of these distances is then obtained to determine an optimal threshold for words se...
Comparison of Two Different Feature Sets for Offline Recognition of Handwritten Arabic Words
Normalization is a very important step in automatic cursive handwritten word recognition. Based on an offline recognition system for Arabic handwritten words that uses a semi-continuous 1-dimensional HMM recognizer, two different feature sets are presented. The dependence of the feature sets on the normalization steps is discussed, and their performances are compared using the IFN/ENIT database ...
Automatic Segmentation for Arabic Character Handwriting
The cursive and ligature nature of the Arabic language makes the segmentation of words into individual characters a difficult task. Despite attempts to apply methods for cursive Latin and other languages to Arabic, these are generally insufficient to segment Arabic text. This paper proposes a new segmentation algorithm for handwritten Arabic text, and the main idea consists of segmenting the word into...
Word-Based Handwritten Arabic Scripts Recognition Using Dynamic Bayesian Network
In this paper, a multi-class classification system for handwritten Arabic words using a Dynamic Bayesian Network (DBN) is proposed, and technical details are presented in terms of three stages, i.e. preprocessing, feature extraction and classification. First, words are segmented from the input scripts and normalized in size. Then, features are extracted from each normalized word, where...
Publication date: 2002